Coordinating multi-agent reinforcement learning with limited communication

Authors

  • Chongjie Zhang
  • Victor R. Lesser
Abstract

Coordinated multi-agent reinforcement learning (MARL) provides a promising approach to scaling learning in large cooperative multi-agent systems. Distributed constraint optimization (DCOP) techniques have been used to coordinate action selection among agents during both the learning phase and the policy execution phase (if learning is off-line) to ensure good overall system performance. However, running DCOP algorithms for each action selection through the whole system results in significant communication among agents, which is not practical for most applications with limited communication bandwidth. In this paper, we develop a learning approach that generalizes previous coordinated MARL approaches that use DCOP algorithms and enables MARL to be conducted over a spectrum from independent learning (without communication) to fully coordinated learning, depending on agents' communication bandwidth. Our approach defines an interaction measure that allows agents to dynamically identify their beneficial coordination set (i.e., whom to coordinate with) in different situations and to trade off performance against communication cost. By limiting their coordination set, agents dynamically decompose the coordination network in a distributed way, resulting in dramatically reduced communication for DCOP algorithms without significantly affecting overall learning performance. Essentially, our learning approach conducts co-adaptation of agents' policy learning and coordination set identification, which outperforms approaches that perform them sequentially.
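The abstract gives only a high-level view of the interaction measure and of how the coordination set is chosen. The Python sketch below is one illustrative way such a mechanism could look; all names (Agent, interaction_measure, select_coordination_set, coordination_budget) are hypothetical, the measure used here (how much a neighbor's action choice can shift the agent's best pairwise Q-value) is an assumption rather than the paper's definition, and state-conditioning of the Q-values is omitted for brevity.

```python
# Illustrative sketch only: the abstract does not specify the interaction
# measure or the coordination-set selection rule. All names here are
# hypothetical and the measure is an assumption, not the authors' method.
from collections import defaultdict


class Agent:
    def __init__(self, agent_id, actions, neighbors, coordination_budget):
        self.id = agent_id
        self.actions = actions              # shared action set (assumed identical across agents)
        self.neighbors = neighbors          # agents this one could coordinate with
        self.budget = coordination_budget   # max coordination-set size allowed by bandwidth
        # Pairwise Q-values q[j][(a_i, a_j)], one table per neighbor,
        # as in edge-factored coordinated MARL.
        self.q = {j: defaultdict(float) for j in neighbors}

    def interaction_measure(self, j):
        """How strongly neighbor j's action choice affects this agent's best response:
        the spread of the agent's best pairwise value across j's possible actions.
        A value near zero suggests j can be dropped from the coordination set."""
        best_per_aj = [max(self.q[j][(ai, aj)] for ai in self.actions)
                       for aj in self.actions]
        return max(best_per_aj) - min(best_per_aj)

    def select_coordination_set(self):
        """Keep only the most influential neighbors, up to the communication budget."""
        ranked = sorted(self.neighbors, key=self.interaction_measure, reverse=True)
        return set(ranked[:self.budget])


# Tiny usage example with made-up values.
agent = Agent("a1", actions=[0, 1], neighbors=["a2", "a3", "a4"], coordination_budget=1)
agent.q["a2"][(0, 1)] = 2.0   # a2's choice matters a lot to a1
agent.q["a3"][(0, 0)] = 0.1   # a3 and a4 barely matter
print(agent.select_coordination_set())   # -> {'a2'}
```

Under this sketch, a DCOP algorithm would then be run only over the edges returned by select_coordination_set, so message passing scales with the pruned coordination network rather than with the full system.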

Similar articles

Voltage Coordination of FACTS Devices in Power Systems Using RL-Based Multi-Agent Systems

This paper describes how multi-agent system technology can be used as the underpinning platform for voltage control in power systems. In this study, some FACTS (flexible AC transmission systems) devices are properly designed to coordinate their decisions and actions in order to provide a coordinated secondary voltage control mechanism based on multi-agent theory. Each device here is modeled as ...


Learning with Whom to Communicate Using Relational Reinforcement Learning

Relational reinforcement learning is a promising direction within reinforcement learning research. It upgrades reinforcement learning techniques by using relational representations for states, actions, and learned value-functions or policies to allow natural representations and abstractions of complex tasks. Multiagent systems are characterized by their relational structure and present a good e...


Learning Complex Swarm Behaviors by Exploiting Local Communication Protocols with Deep Reinforcement Learning

Swarm systems constitute a challenging problem for reinforcement learning (RL) as the algorithm needs to learn decentralized control policies that can cope with limited local sensing and communication abilities of the agents. Although there have been recent advances of deep RL algorithms applied to multi-agent systems, learning communication protocols while simultaneously learning the behavior ...


Coordination in multiagent reinforcement learning systems by virtual reinforcement signals

This paper presents a novel method for on-line coordination in multiagent reinforcement learning systems. In this method a reinforcement-learning agent learns to select its action estimating system dynamics in terms of both the natural reward for task achievement and the virtual reward for cooperation. The virtual reward for cooperation is ascertained dynamically by a coordinating agent who est...
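The snippet above describes combining the natural task reward with a virtual cooperation reward supplied by a coordinating agent. Since the abstract is truncated, the minimal Q-learning sketch below only illustrates that additive idea; the names (virtual_reward, alpha, gamma) and the simple summation are assumptions, not the paper's exact formulation.

```python
# Minimal sketch: a tabular Q-learning step where the coordinating agent's
# virtual reward is simply added to the environment's natural reward.
# The additive combination is an assumption for illustration only.
from collections import defaultdict

q = defaultdict(float)      # q[(state, action)]
alpha, gamma = 0.1, 0.95    # learning rate and discount factor (arbitrary values)

def q_update(state, action, next_state, natural_reward, virtual_reward, actions):
    """One Q-learning update on the combined reward signal."""
    total_reward = natural_reward + virtual_reward
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (total_reward + gamma * best_next - q[(state, action)])
```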


Indeterminacy Reduction in Agent Communication Using a Semantic Language

In recent years, the importance of vagueness and uncertainty in the messages exchanged between agents has been highlighted mainly due to the ubiquitous nature of the (artificial or human) agents’ communication. The imprecision in the communication becomes more significant when the autonomy of the agents increases or the number of exchanged messages for a communicative goal is limited. In this p...




Publication date: 2013